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Abstract We study in limit law the complexity of some anticipated rejec¬ 
tion random sampling algorithms. We express this complexity in terms of a 
probabilistic process, the threshold sum process. We show that, under the 
right conditions, the complexity is linear and admits as a limit law a so-called 
Darling-Mandelbrot distribution, studied by Darling (1952) and Lew (1994). 
We also give an explicit form to the density of the Darling-Mandelbrot distri¬ 
bution and derive some of its analytic properties. 
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1 Introduction 

This paper aims at answering the following algorithmic question: consider a 
program P that performs a random number of elementary operations and then 
terminates. Our goal is to have P performing n operations in one run. To do 
that, we run the program P until it reaches n operations; if it terminates before 
that, we simply restart it. The question is, how many elementary operations 
must we perform to reach this goal? 

Algorithms of this type are abundant in the field of random sampling, where 
they are known as anticipated rejection algorithms. Given a class of discrete 
objects, a random sampling algorithm takes an integer n as input and outputs 
a random object of size n according to a specific (usually uniform) distribution. 
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Given a random sampling algorithm for a class A and a subclass B of A, an 
element of B can be sampled using a rejection algorithm: we repeatedly sample 
elements of A until we find one in B. This algorithm can be improved when it 
is possible to know in advance, during the sampling procedure, that the drawn 
element is not going to be in B: we can then prematurely reject the sample and 
start over, saving computing time. This scheme is called anticipated rejection. 
Assuming that sampling an element of A costs n elementary operations, this 
fits into the framework outlined above. 

Such algorithms are found for example in (Barcucci et ah, 1994[ 19951, 
sampling prefixes of Motzkin paths (the so-called Florentine algorithm). Some¬ 
what miraculously, this algorithm achieves an average linear time complexity, 
as, on average, the number of necessary trials is O^y/n) and each trial costs 
0{^Jn). We show that this phenomenon is not isolated, but rather happens in a 
wider range of cases. Other algorithms of this family exist, sampling Schroder 


prefixes (Penaud et al. 2001), unary-binary trees (Bacher et ah, 2014) and 


constrained random walks. 

In this paper, we study the full limit distribution of the complexity of 
these algorithms. This problem leads us to define a probabilistic process, the 
threshold sum process. Our main result is that, if the base distribution has a 
tail with exponent a in a certain range, this process admits a limit distribution 
depending only on a. This universality phenomenon is reminiscent of Levy’s 
well-known theory of a-stable distributions, which also deals with sums of 


independent random variables (Gnedenko and Kolmogorov 19681. 


Surprisingly, our limiting distribution has already been studied in relation 
to a different problem, namely, the ratio between the sum and the maximum 
of a fixed number of i.i.d. random variables. It was first studied by Darling 


(Darling 

1952 

), then apparently by Mandelbrot in unpublished work, and by 

Lew ( 

Lew 

1994 

), who named it the Darling-Mandelbrot distribution. This 


distribution has a parameter a, with 0 < a < 1; it is supported on 
defined by its characteristic function: 


(j)a{s) = 


{-is)- 


—a'y{—a, —is) 


-h-i: 



and is 


( 1 ) 


where in the first expression, 7 (-, •) denotes the lower incomplete gamma func- 
tior0 The second expression allows to easily extract the moments of the dis¬ 
tribution as rational functions of a. Lew showed that the distribution has an 
exponential tail; moreover, we show that its density is non-analytic at all inte¬ 
ger points. Both properties contrast with the Levy distributions, which have 
an analytic density and a heavy tail. 

In the case of the Florentine algorithm (which corresponds to an expo¬ 
nent a = 1/2, as seen below), an expression of the Laplace transform of the 


^ Given the definition of the Gamma function r{y) = xy~^e~^dx, the upper and lower 
incomplete versions are defined through the corresponding integrals on modified domains 
r(y,z) = ■ and 'y{y,z) = r(y) — r{y,z) = fg-, respectively. Non-positive real values 

of 2 are reached by analytic continuation. 
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limit distribution already appears in (Louchard 1999), namely: 


1 

+ y/TTZeTi{y/z) 


We readily check that this expression is equivalent to ^ 1/2 (j-z), the Laplace 
transform of the Darling-Mandelbrot distribution of parameter 1/2. 

The paper is organized as follows. In Section we define the threshold 
sum process and show that, under some conditions, its limit distribution is 
a Darling-Mandelbrot distribution. In Section we give an explicit form for 
the Darling-Mandelbrot density and give analytic results expanding those of 
Lew. Finally, in Section we use these results to analyse some anticipated 
rejection algorithms. 


2 The threshold sum process 

In the following, let (Xi)j>o be a sequence of independent and identically 
distributed random variables with values in N or IR_|_ and unbounded support. 
We denote by F{x) the complementary cumulative distribution function of the 
X,’s: 

F{x) = P(W > x). 

Let t > 0 and let I{t) be the smallest index such that ^ t- Define the 
threshold sum process (TSP) Yt as: 

Yt = Xq ■ ■ ■ + 

The number t is called the threshold. This process resembles the classical sum 
of independent random variables, but the number of summands I{t) is here a 
random variable depending on the real parameter t. Our main result on this 
process is the following. 

Theorem 1 Assume that, as x tends to infinity, F(x) is equivalent to cx~°‘ 
for some c > 0 and a > 0. Then, as t tends to infinity, the random variable 
Yt satisfies: 

— if a < 1, then Yt/t converges in distribution to the Darling-Mandelbrot law 
of parameter a; 

— if a = 1, then Yt/{t log t) converges in distribution to the exponential law; 

— if a > 1, then Yt/(t°'c~^p,), where /i = E(Xi), converges in distribution to 
the exponential law. 

To us, the most interesting case is a < 1, where the behavior of Yt is 
strongly universal in that it only depends on the exponent a. Moreover, the 
scaling factor is always t in that range (this is different from Levy’s theory 
of sums of i.i.d. random variables, where the scaling factor is a power of t 
depending on a). For 0=1, the scaling factor is augmented by a logt factor; 
for a > 1, the scaling factor is higher and we have a lesser form of universality, 
with the limit scaled by p/c. Consequences of these facts to the analysis of 
algorithms are discussed in Section]^ 
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Proof We prove this result using Levy’s Continuity Theorem, which states 
that a sequence of random variables tends in distribution to some limit if their 
characteristic functions converge pointwise to the characteristic function of the 
limit distribution. 

Let ipt{s) = be the characteristic function of the random variable 

Yt/r, where r is a scaling factor (depending on t) to be specified later on. 
The index I{t) is geometrically distributed with parameter F{t), which is the 
probability that Xi > t. The random variables Xq, ..., Xj(^f'j_i are constrained 
to be less than t; let Xt(s) = < t) be the characteristic function of 

such a constrained variable. We have: 

Ms) = - 7 --■ 

1- (l-F(t))xt(s/r) 

On the other hand, we have: 


^ ’ n=0 


We therefore have: 

Ms) = 


m 


^ Mt^n [isY 

2-^ r" n\ 

n—0 


with 


Mt,n = 


^dF{a 


1-E 


1 

T^F{t) n\ 


where the last simplification follows from the fact that Mt^ = 1 — F{t). 

Consider first the case where a < 1. Using integration by parts, we find 
that the term Mt^n satisfies as t tends to infinity: 

pt 

Mtn = -t"F(t)+ nx^-^F{x)dx - — ct^-^. 

Jo n-a 

Moreover, as F is nonincreasing, we have a bound F{x) < c'x~°‘ for some 
constant F. This enables us to dominate by: 

77 

<-c'r-“. 

n — OL 

Picking r = t, a dominated convergence argument and the expression Q 
therefore show that the characteristic function iptis) tends to the characteristic 
function ^q(s) of the Darling-Mandelbrot distribution. We conclude using 
Levy’s theorem. 

If a = 1, we have ~ clogt as t tends to infinity; if a > 1, Mt i tends 
to the finite value /r. This means that the ratio Mt^i/[TF{t)] tends to 1 with 
the respective values r = tlogt and r = Moreover, in both cases, all 

the higher moments satisfy Mt^n = 0{F~^) and are therefore negligible before 
T'^F{t). This means that iptis) satisfies: 

Ms) W-—) 

1 — IS 

which is the characteristic function of the exponential distribution. 















Complexity of anticipated rejection and the Darling—Mandelbrot distribution 


5 


3 The Darling—Mandelbrot density 


This section is devoted to the computation and the derivation of properties of 
the density, denoted by g, of the Darling-Mandelbrot distribution. By studying 
the Laplace transform, Lew (Lew 1994 1 determined that 5 is a continuous 
function satisfying: 


g{x) = Cox°‘ ^ + 0(1), a: —>■ O'*"; 

5(2:) = —a; ^00, 
a 

where Cq is the constant: 

_ sin(Q;7r) _ 1 

° TT T(i — a)r{a) 


( 2 ) 

( 3 ) 

( 4 ) 


and where —Oq is the real zero of the function z 1 — >■ z“ 7 (—a, z) and Oi > Oq 
( see the reference for details). 


3.1 Explicit forms of the density 


Theorem 2 Let 0 < a < 1. The Darling-Mandelbrot density g(x) is equal to: 

00 

9{x)=^gk{x), (5) 

fc =0 

where the function gk{x) is continuous for x > 0, supported for x > k, analytic 
on its support, and has the two following equivalent definitions. 

— Let a(x) and b{x) be the functions, supported for x > 0 and x > 1 respec¬ 
tively, defined by: 

a{x) = Cox°‘~^ and b{x) = —Cq— —(6) 

X 

where Cq is defined by (|^. Then gk{x) is equal to the convolution product: 

gkjx) = a=¥ b* - - * bX x). (7) 

k times 


— Let: 

&=o + Ml + o) and = ® 

Then gk{x) has the power series representatioi^ convergent for k < x < 
k 1 and analytically continuable for x > k -\-1: 


gk{x)=Ck{x-kf>‘-^ Y, [k-xr-+-+-\ (9) 

\Pk)n-\-\ - \-nk 

ni,...,nfc>0 / 1 I IK 


^ The sum in this expression can be seen as a special case of the Lauricella function 
where all variables are specialized to —x. 


(k) 
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where {y)n = r{y + n)/r{y) is the Pochhammer symbol for the rising 
faetorial. 

Again, we make some remarks before proving the theorem. First, since the 
summand gkix) has support for x > k, the infinite sum in ([^ is locally finite, 
which justifies its existence and shows that g is continuous. Moreover, since 
b{x) is negative, the summands alternate in sign. In particular, if a; < 1, we 
have g{x) = go{x) = CqX°‘~^. This shows that the error term in is in fact 
zero in that range. In the case fc = I, the sum in takes the form of a 
hypergeometric function: 

gi{x) = Ci{x - 1)^“ 2Fi{1, 1 + a; 1 + 2a; 1 - x). 

For a = 1/2, this simplifies into: 

gi{x) = -. 

TT 

The theorem also enables us to find the singularities of the density g{x). 
Since the leading term in the sum in Q is 1, the function gk{x) has a singu¬ 
larity at k of the form: 

gk{x) = Ck{x - la;>fe -t 0{{x - k)^>‘). 

Moreover, as the function gk(x) is analytic for x > k and the sum ([^ is locally 
finite, the density g{x) is singular at all integer points and analytic otherwise, 
the singularity at the point x = k being contributed by gkix). 

Finally, we note that although the sum in (§ behaves very well locally 
(indeed, it’s locally finite), it’s not the case globally: as x tends to infinity, 
gkix) behaves like and so alternately tends to ±oo for k sufficiently 

large. Yet, as found by studying the Laplace transform, the sum converges 
exponentially fast to zero. 

In order to plot the density gix), the most adequate characterisation is 0. 
or better yet, the differential equation of the forthcoming Theorem which 
was used to produce Figure 


.6 

.5 

.4 

.3 

.2 

.1 

0 



Fig. 1 Plots of the density g{x) for a = 1/4, a = 1/2 and a = 3/4 (from left to right). 
Dashed, the continuation of the partial sums po + ‘ beyond x = k + 1. The precision 

is far beyond line thickness (as easily obtained through the characterisation of Theorem]^. 
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Proof (of Theorem Let us first prove the convolution product representa¬ 
tion. From the identity 0 we find the Laplace transform of g{x)^ that we 
denote by G{z)-. 


G{z) = 4>a{iz) 



z)' 


We transform this into: 


G{z) 


—a[r{—a) — r{—a, z)) 


Y^A{z)B{z)\ 

k>Q 


where is the upper incomplete gamma function and: 


A{z) 


r{l-a) 


and 


B{z) 


r{-a,z) 

r{-a) 


(if z is large enough so that \B{z)\ < 1; numerically, fHe(z) > 0.107878... 
suffices, uniformly for all a). 

Noting that F(1 — a)r{a) = —r{—a)r{l + a), the following elementary 
computations show that the functions a{x) and b{x) defined in ([^ have Laplace 
transforms A{z) and B{z), respectively: 


x°‘-^e-^^dx = / x°‘-^e-^dx 

Jo 

= z-^r{a)-, 

pOO poo 

J J {x — \)°‘e~^'^dxdy 

pOO 

= / + a)e~^dy 

J Z 

= r{-a,z)r{l + a). 


[x - 1)“ 


Inverse Laplace transform thus yields Q. The function gk{x) is analytic for 
X > k as the convolution product of analytic functions. 

Let us now prove the power series representation. Let 1 < a: < 2. A Taylor 
expansion of the function b{x) yields: 


b{x) 


1 


E(-i) 

n>0 


(x- 1)“+" 
r(i + a) ■ 


The identity ([^ then follows from Q using the classic formula: 

* ... * - 

Jri Jrk JriS - hrk ’ 




r(a) 


( 10 ) 


Finally, since gk{x) is analytic for x > k, its value for x > fc -|- 1 is found 
by analytic continuation. 
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3.2 Differential equations satisfied by the density 


In this section, we characterize the density g not explicitly, but implicitly as 
the solution of differential equations. Since g is singular at all integer points, 
all differential equations are understood to be satished only outside singular 
points. 

Theorem 3 The density g{x) is the only continuous solution of the non-linear 
differential equation: 

xg'{x) + (1 - a)g{x) = -ag * g{x - 1), (11) 

with initial condition ([^. 

As the density g is positive, this result shows in particular that g is de¬ 
creasing. In fact, the equation above can be rewritten as 

^ {x^~°‘9{x)) = -ax°‘g * g{x - 1). 

This makes evident the stronger statement that x^~°‘g{x) is nonincreasing. 
This answers a question of Lew, who suggested that g{x) might show oscilla¬ 
tions for small values of a. 


Proof Let us prove that g satishes the equation. One way to proceed is to 
differentiate the Laplace transform G{z). One can also directly use the repre¬ 
sentations of Theorem Another way, that we detail here, is to compare the 
threshold sum processes at thresholds t and u, with u > t. We have: 


Yt if X^t) > u] 

Yt Xj(j.-j +Yf if t < X/p) < M, 


where Yf is independent from Yf and distributed like T„. 

Now, set u = Xt and let t tend to infinity. The event Al/p) > u occurs with 
probability F{u)/F{t) —)■ A““. If it does not, we have Xj(^t) = t 0(X — 1). 
Dividing by t and recalling that Yt/t tends to the law of density g, we get: 


A- 


^g(A = A °‘g{x) -F (1 - A 


'‘){g*g{x-l) + 0{X-l)). 


We recover ( |11[ ) at first order in A — 1. 

To show the uniqueness of the solution, we note that the right hand side of 
(111 depends only on the values of g{y) for y < x — 1; in particular, it is zero for 
X < 1. This enables us to solve iteratively the equation on the intervals [A:, fc-l-l], 
treating the equation as an inhomogenous linear differential equation, with the 
initial value f{k) found by continuity. This determines the solution uniquely. 


Our final result writes the density g as the solution of linear differential 
equations. Write dx = d/dx and let and Ej. be the differential operators: 


Dk = dx{x -k) - {k-{- l)a; 


Ek = Dk-i ■ ■ ■ Dq. 
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Theorem 4 The operator Ek cancels the functions go,..., gk-i defined in 
Theorem^ In particular, it cancels the density g on the interval [0, fc]. 

Proof To prove the theorem, we need the following elementary facts about 
convolution products: 

x{u * v) = {xu) * V + u* (xv); (u * v)' = u' * v. 


We first prove by induction that, for 0 < £ < A:, we have: 

7 -^ A:! , 

where 

^ ^ _ (x- 

“ r(i-a)r(-ayr((£ + i)a) 

For £ = 0, this is obvious as oq = a. Otherwise, assume that the identity is 
true at rank £ and apply the operator Du to it. Using the above properties of 
convolution products, we have: 

Ei+x ■ gk = + {k-^£- 1)! °^ * 


Since Di annihilates ai, we conclude using the fact that ai*{xb)' = ae+i found 
using formula (10). 

At £ = k, we thus find Ek ■ gk = k'. Ok- Since Dk ■ Ok =0, we have indeed 
El ■ gk = 0 ioT £ > k. 


4 Applications 

In this section, we apply our results to the analysis in limit law of random 
sampling algorithms. In all cases, this complexity is linked to a threshold sum 
process that falls within the conditions of Theorem[2 Among the three regimes 
in this theorem, the most favorable is the first one, with the scaling factor t 
meaning that the algorithm has linear complexity. 

In the following, we consider an anticipated rejection algorithm based on a 
process with survival probability at time t asymptotic to ct~°‘; the algorithm 
consists in running the process repeatedly until it reaches time t. Since the 
successful run takes time t, the complexity normalized by t follows a Darling- 
Mandelbrot distribution shifted by one, with characteristic function 
(this coincides with Darling’s initial dehnition). We denote by D{a) this shifted 
distribution. 

In some cases, the algorithm has a second round of rejection on top of 
anticipated rejection, i.e., it may fail and be restarted upon reaching the tar¬ 
get t. Let us assume that it succeeds with a fixed probability p. The overall 
complexity of the algorithm is then of the form Yi + ■ ■ ■ + Yz, where the 
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li’s are independent variables following the law Via) and Z > 1 is geomet¬ 
rically distributed with parameter p. Let 'D{a,p) denote such a distribution 
and e*®(/>a_p(s) be its characteristic function. We have: 




P(t>a{s) 

1 - (1 -p)e**^a(s) 



n — a n\ 


-1 


( 12 ) 


This situation typically arises when each step of the algorithm consists in 
growing the sampled object by an increment Si,..., with respective proba¬ 
bilities pi,... ,pk- In this case, there is a possibility that the sample misses the 
target size t by hopping over it. In the aperiodic case (where si A • • • A Sfc = 1), 
this occurs with an asymptotic probability p = 1/S where S = 
drift of the process. Slightly more subtle is the situation in which the sfs are 
not all non-negative (but still the drift is positive), an eventuality discussed in 
Section [T4| Examples are detailed below. 

Let y be a random variable following the distribution 'D{a). The moments 
of Y can be recovered by Taylor expansion of the expression Q multiplied 
by e*'*. In particular, we have: 


E(y) = 


v(y) 


a 

(1 — 0)^(2 — a ) 


As convergence in distribution implies convergence of moments, this will enable 
us to compute the asymptotic behavior of the moments of the complexity of 
the algorithms. The distribution 'Dia, p) can be treated in the same way using 
pi). This yields: 


E(y) = 


1 

p{l - a)’ 


v(y) 


a -b 2(1 — p)(l — a) 
p2(l — a)^(2 — Of) 


4.1 Prefixes of Motzkin paths and directed animals 


The simplest algorithm that fits in our framework is probably the one described 
in ( jBarcucci et al. , [1994 ), which samples prefixes of Motzkin paths {i.e., lattice 
paths with steps in {/',\(,—t} never stepping lower than their origin). Using 
a bijection of Penaud, they thus get a random sampling algorithm for directed 
animals. A generalization appears in (Barcucci et al. 1994), which deals with 


the case where there are several possible steps of each type (colored Motzkin 
prefixes). 

The algorithm is very simple: the path is built by adding random steps one 
at a time. If, at any time, the path steps below the origin, the algorithm is 
started over from scratch. If the target size n is reached, the path is output. 
To our knowledge, this is the best known algorithm for exactly sampling such 
structures, with the exception of the special case in which there is no —>■ step 
{i.e., prefixes of Dyck pat/is)|^ 


^ To sample these , a better (in fact, optimal) algorithm consists in using the algorithm of 
jBacher et al.[ |2014[ l to sample a pointed binary plane tree and using classical bijections to 
get a Dyck prefix. 
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Proposition 5 Assume the number of possible and steps is the same. 
Let Yn he the number of steps drawn by the algorithm to sample a path of 
length n. As n tends to infinity, the random variable Y^fn tends in distribution 
to the law . 

In particular, we recover estimates of the expected value and variance given 
by Barcucci et al., namely: 

E(y„) ~ 2n; V(r„) ~ 

Proof Let k be the total number of available steps. If there are as much differ¬ 
ent and steps, the number 77 i„ of Motzkin prefixes of length n satisfies 
~ where c is a constant. 

Let X be the random variable counting the number of steps before a ran¬ 
dom path goes below the origin. We have X > n if and only if the first n steps 
form a Motzkin prefix, which happens with probability m„/fc" ^ 

As outlined above, the random variable Yn onsists of two parts: the cost 
of the unsuccessful trials, which follows a threshold sum process with base 
distribution X and threshold n, and the cost of the final successful trial, which 
is n. By Theorem[^ the quotient Yn/n thus tends to the shifted law 


4.2 Prefixes of Schroder paths 


A variant of the previous algorithm, sampling prefixes of Schroder paths, is 
found in (Penaud et al. 2001 1 . A Schroder path has the same constraints as a 
Motzkin path and takes steps in {/^, \, —(where —i—>■ has length 2). As 
shown in (Bacher 2014), these paths are also in bijection with directed lattice 
animals, this time on the king’s lattice (Figure]^. 

The algorithm is similar to the one above, but the steps are 

taken with respective probabilities p, p, with p = — 1. There is another 

difference: when sampling for a target size n, it is possible to jump from n — 1 
to n -|- 1 by generating a —i—>. In this case, we must discard the path and start 
over. As the following result shows, this modifies slightly the limit behavior of 
the complexity while keeping it linear. 


Proposition 6 Let Yn be the total length of the steps drawn by the algorithm 
to sample a Schroder prefix of length n. The random variable Yn/n tends in 
distribution to the law T){\, )■ 


From (12), we get the expected value and variance of y„: 


E(F„) - (8 - 4v^)n; 


16 

V(T„) ~ y(l6-llv^)n^ 


Proof Let s„ be the number of Schroder prefixes of length n and Pn be the 
probability to reach one of them. As we have ^ cp~'"'n~^^‘^, we have ~ 
, where c is a constant. 
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Motzkin 


zx 





Fig. 2 Families of meanders in bijection with directed lattice animals. 


Let X be the random variable counting the length of the path sampled 
before it goes below the origin. The event X > n can occur in two ways: either 
we sample a Schroder prefix of length n or a prefix of length n — 1 followed by 
a —the probability of this is + p^pn-i ~ (1 + In the same 

way as for Proposition!^ the time necessary to reach this tends in distribution 
toV{\). 

Finally, out of the two above possibilities, we are interested only in the case 
where we draw a Schroder prefix of length n. This happens with probability 
Pn/{Pn + P^Pn-i) “t 1/(1 + p^) = (2 + \fi) j The number of times the size n 
is reached is geometrically distributed, hence the result. 


4.3 Unary-binary trees 


Another recent anticipated rejection algorithm appears in (Bacher et al. 20141 


sampling unary-binary plane trees. The algorithm works by letting a tree grow 
from size 1 to n using a grafting process akin to Remy’s algorithm for binary 
trees, based on a holonomic equation. This process may fail, however, in which 
case the algorithm is restarted. For our analysis, we use the following two facts: 
first, the probability of reaching at least the size n during the growth procedure 
satisfies ~ with c a constant; second, at each step, the tree grows 

by 1 or 2 nodes with respective probabilities 2/3 and 1/3. If this takes the size 
of the tree from n — 1 to n -I- 1, the algorithm is restarted. 


Proposition 7 Let Yn be the number of nodes of the trees built by the al¬ 
gorithm to sample a tree with n nodes. The random variable Y^jn tends in 
distribution to the law f) • 
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Again, we deduce the expected value and variance from (12): 


E(r„) ~ 



v(y„) ~ 


32 

y 


Proof The proof is identical to the one of Proposition The form of the 
probability Pn shows that the time necessary to reach size n is, normalized 
by n, distributed like Knowing we have reached at least n nodes, the 

probability to hit exactly n is Pn/{Pn + \pn-i) This concludes the proof. 


We remark that another way of sampling unary-binary trees with n vertices 
is through the classical bijection with Motzkin excursions of length n — 1. 
These are in 1-to-n correspondence with Motzkin paths of length n and ending 
at ordinate —1, which are themselves in bijection with prefixes of Motzkin 
excursions, of length n and ending at odd ordinate. Such a prefix can be 
sampled using the procedure of the previous section and a rejection scheme, 
but the probability of rejection (checking if the final ordinate is odd) is then 
asymptotically 1/2 instead of 1/4, leading to a slightly worse complexity. 


4.4 More general holonomic systems 


The algorithmic strategy outlined in (Bacher et al., 2014) is potentially amean- 
able to a variety of problems. Several combinatorial structures, with a size 
parameter n, have generating functions satisfying a holonomic equation, 
i.e., an equation of the form 


= 0 (13) 

iei 

where / is a hnite subset of Z, and Pi{n) are polynomials with rational coeffi¬ 
cients such that Po{n) ^ 0. Let d the maximal degree among the Pi’s, and pi 
the coefficient of degree d in Pi (possibly zero). Asymptotically, we have 

Y,P^Zn-^ = Zr,0{n-^). (14) 

iGl 

Suppose that the holonomic equation can be rewritten as 


= y^Pi(n)^n-z 


IGI 


(15) 


(up to a redefinition of Pq), so that the coefficients of the Pi’s are positive 


fn—i 

k 


)0We can interpret 


rationale, when Pi is written in the polynomial basis 
the fc-th basis polynomial as associated to the enumeration of objects with 
k marked unit elements. The positivity of the coefficients may prelude to the 
design of a bijective interpretation of relation (15), in which the marks undergo 


The condition on the form of the left-hand side can be relaxed to some extent, we treat 
here a simplified situation in order to lighten the notation. 
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a local dynamics, implemented with small complexity. We have an analogue 

(16) 


of equation (141, of the form 

- ^PiZn-i = ZnO{n~^) , 


iGl 


where now pi is the coefficient of in Pi. Let lim„_,.oo In De¬ 

fine the drift 6 as the average of * S /, according to the distribution 
(which is normalised). The bijection discussed above can be turned into an 


algorithm, possibly of anticipated rejection. This is what happens in (Bacher 


et al.[ 2014), for binary and unary-binary trees. In the first case the algorithm 


has no reject, in the second case anticipated rejection is required. Anticipated 
rejection may be needed when the bijection involves, on the RHS, a combina¬ 
torial object with less than d marks. In some cases, the missing marks can be 
resampled uniformly without introducing any bias, while in other cases this is 
not possible, and the growing object has to be rejected. 

The size at each algorithmic step t changes by a random value it € I. This 
happens asymptotically with probabilities If the drift is positive, the 

size makes a directed random walk with a positive slope, which, with high 
probability, either intersects n after ^ n/S + 0{y/n) steps, or hops over this 
value and goes towards infinity. Thus, if anticipated rejection is required, with 
exponent a in the appropriate range, we are in the context of the geometric 
convolution of the Darling-Mandelbrot distribution discussed at the beginning 
of the section. 


If / C N"*" (we say that 1 is non-backtracking in this case), the walk either 
passes by n exactly once, or misses it; asymptotically, this happens with prob¬ 
ability 1/5 and 1 — 1/5, respectively (provided I is aperiodic, that is, has no 
common divisor > 1). If the value n is missed, we shall restart the algorithm. 

If I has support on both positive and negative integers (and thus is back¬ 
tracking), the walk may intersect n more than once, and the first hit of n 
may occur after that larger values have been reached. This makes the opti¬ 
misation and analysis of the algorithm slightly more complicated. Any of the 
hitting events gives an unbiased sample, and a concrete algorithm will just 
take the first one. Having a positive number of hitting events happens now 
with probability smaller than 1/5, but still 0(1) (the exact asymptotic prob¬ 
ability involves a complicated expression in the pfs, an analysis postponed to 
the following paragraphs). At any time, possibly in light of the current size 
parameter, we have the right of restarting the algorithm. Restarting as soon 
as a value higher than n is attained is a feasible choice, but non-optimal by a 
constant factor in complexity, as at values n' = n -I- 0(1) we still have a prob¬ 
ability 0(1) of hitting n in 0(1) further steps, that largerly pays off against 
the expensive restart procedure. It is more efficient to restart the algorithm 
as soon as we can conhdently suppose that, with high probability, the walk 
has reached a size larger than n for never coming back. Based on the univer¬ 
sal behaviour of drifted one-dimensional random walks, a generic simple such 
strategy, which is asymptotically optimal, is to restart as soon as the current 
size reaches n -I- y/ri. 
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We now analyse the probability of having a positive number of intersec¬ 
tions, in the backtracking case. The asymptotic probability tTs that there are 
s intersections has the form ttq = a, and tTs = hd^~^ for s > 1. This is due 
to the fact that bridges at height n are independent events, and are thus con¬ 
catenated geometrically. The resulting convoluted distribution is thus T^ia, a), 
and we shall determine a. 


Normalisation gives a-|-6(l —c)“^ = 1. The average number of intersections 
is 6(1—c)“^, and must be given by 6“^. However, these two trivial equations are 
not sufficient to determine a, and we need a third relation. We can determine 
the average of (*), divided by the average of s, which is c/(l — c). In fact, 
the latter is represented combinatorially by a path crossing n, in which one of 
the crossings is marked, and the former is the analogous event in which two 
crossings are marked. The two semi-infinite parts of the walk have analogous 
distributions in the two processes, and the part of the walk between the two 
marks is a random bridge, independent from the rest of the path. So, this 
accounts to evaluate the generating function of bridges, at criticality, for the 
asymptotic step rates PiC~^- Such a quantity is written as a Cauchy integral 
involving the kernel (Laurent) polynomial, K{uj) = and is written 

in terms of the residues at those roots of the polynomial, which are series in 


the parameter uj with no Laurent part (called small roots, see (Banderier and 


Flajolet, 2002)). 


It is legitimate to ask whether there exist concrete applications in which 
the set / described above is backtracking}^ A detailed discussion of this point 
would be besides the scope of the present article. Let us however provide a 
simplistic example, of a recursion in which Zn is a rational series, satisfying 
a holonomic equation with constant coefficients. The example shall illustrate 
how backtracking recursions may arise easily from small modifications of non¬ 
backtracking problems, preserving the probabilistic interpretation of the asso¬ 
ciated generating series. It is well known that Fibonacci numbers satisfy the re¬ 
cursion Fn = Fn-i + Fn-2, with Suitable initial conditions Fg = 0 and Fi — 1. 
Such a recursion has set / = {1,2}, thus it does not provide an example of the 
form we seek. These numbers can be refined to integer-valued polynomials, e.g. 
as F)) = F^_i + xF^_ 2 , or as F" = xF”_i + xF ”_2 (in the two cases, again 
for suitably chosen initial conditions, the polynomials are trivially related: 
F”{x) = x"F))(a;“^)). In our prospective of exact sampling, n is the size, and 
x G M’*’ is a parameter in the measure on the associated combinatorial objects 
(Fibonacci trees, or dimer-monomer configurations on an interval). Combining 
the equations at two consecutive sizes, we have (1 -I- x)F^ = F^_^_i + x^F))_ 2 , 
and {l + x)F” = F"+i + xF”_ 2 , in the two cases. For x G K"*", both these equa¬ 
tions have set I = {—1,2}, and are thus backtracking. Of course, we cannot 
be satisfied with these examples either: these quantities satisfy also the sim¬ 
pler customary Fibonacci relations, which provide a simpler, non-backtracking 
implementation of the sampling algorithm. In other words, both of the as- 


® This question has been posed also by the anonymous referee. 
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sociated polynomials, 1 — u>(l + a:) + w^x = (1 — w)(l — wx — lu^x) and 
1 — ■u;(l + x) + w^x^ = (1 — wx){l — w — w'^x), factorise. 

Consider now any convex combination of the two relations, e.g. 

2(1 + x)Fn = 2Fn+l + x{x + l)Fn-2 (17) 

which has associated polynomial 1 — w{l + a;) + ^w^x{x + 1 ), that is not 
factorisable. Still, under suitable initial conditions (such as Fg = 0, Fi = 1, 
F 2 = 1 + a;), the polynomials 2 "'F„(a;) have positive integer coefficients, with 
a potential combinatorial interpretation!^ 


4.5 Random walks in conical domains 


In this section, we study models of constrained random walks. The complexity 
of the anticipated rejection algorithm is governed by the survival probability 
of the model, that is, the probability of a random walk of length t to satisfy 
the constraints. The analysis of survival probability for this class of problems 
has a long history, that dates back at least to Sommerfeld at the beginning of 
the century. A review of results can be found in (Redner, 2001 Chapter 7), 


and Wachtel 


and a modern approach with a rigorous derivation can be found in (Denisov 

2015| n 


The first case we describe is random walks in the square lattice constrained 
to remain in a wedge of angle 9. As explained in ( Redner[ 2001, Section 7.2), 
the survival probability satishes in this case F{t) ^ ct~°‘ where 29a = tt. 
An identical result holds for other regular lattices (such as triangular, hexag¬ 
onal,. .. )|^ This gives an exponent a ranging from 1/4 (for excluding just a 
half-line) to arbitrarily large (for a narrow wedge); however, arbitrarily small 
values of a can be found by considering values of 9 greater than 27r, by taking 
into account the winding number of the walk. In particular, we find: 

— for 9 > 7 r/ 2 , the algorithm has linear average complexity and limit law 

His)- 

— for 6 = 7r/2, the algorithm has average complexity nlogn and exponential 
limit law; 

— for 9 < 7 r/ 2 , the algorithm has average complexity and an expo¬ 

nential limit law. 


® The positivity property still holds for the homogeneous, more refined polynomials asso¬ 
ciated to the equation 

{2x-\-y ^ z)Fn = Fn+i -h x(y -h z){2x + y + z)Fn-2 
Fo = 0] Fi = 1- F 2 = x-\-y. 


We thank M. Bousquet-Melou and K. Rashel for pointing out this reference. 


20011, but it could be 


^ This is not discussed in the synthetic presentation of (Redner _ 

derived on identical basis, and it is implicit in the large generality of the results in ( [Deniso^ 
[and Wachtel[[2015[ |. 
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Fig. 3 Left: example of Kreweras’ walk. Right: example of Gessel’s walk. 


For specific values of 0, these walks can be realized as walks in the quarter 


plane with some prescribed steps (Bousquet-Melou and Mishna 2010). For 
instance, Gessel’s walks, with steps {i/, f—, y, —>■}, correspond to walks in 
the square lattice in a wedge of angle 37r/4. The variant of Kreweras’ walks 
with steps {I, j/, f—, t, y*, —t} correspond to walks in the triangular lattice in a 
wedge of angle 27r/3. Similarly, the other two variants with steps {J,, ■<—, /^} and 
correspond to walks in the same wedge, for the oriented triangular 
lattice, in which the edges are oriented in an alternating way around each 
vertex, the two families of walks corresponding to the two possible orientations 
(see Figure]^. The anticipated rejection algorithm thus has linear complexity 
in both cases, with respective limit laws I’d) (Gessel) and I’d) (Kreweras, 
in the three realisations). 

A more complicated case is random walks in constrained in a cone 
defined by d < 0max (in spherical coordinates). In this case, the survival 
probability is F{t) ~ ct~''/‘^^ where v is the smallest positive number such 
that Pv{cos9raa^) = 0, where is the Legendre function (Redner 2001, Sec¬ 
tion 7.3). This allows for the exponent a = j^/ 2 to be any positive number. 
In particular, since P 2 (x) = (3x^ — l)/2, an exponent a < 1, and thus a 
linear-time algorithm, is achieved for ^max > arccos(l/-\/3). 

In fact, generic cones in generic dimensions, and for a large class of pe¬ 
riodic lattices, can also be handled in this way. Full details can be found in 
(Denisov and Wachtel 2015), and, in particular, their Section 1.2 illustrates 


the required precise technical assumptions. Let us summarise in few words 
these hypotheses. There are four of them. The first two are mild requests on 
the shape of the cone, which in particular are automatically satisfied in di¬ 
mension 2. A third hypothesis allows for long-range walk steps, provided that 
certain moments are finite (we only considered walks with finite-range steps 
here). A fourth hypothesis requires that the associated unbounded random 
walks undergo isotropic diffusion, and always holds in absence of drift (as we 
require here for having non-trivial asymptotics), up to applying an appropriate 
affine transformation to the lattice. 

Let I? be a cone of and let l7o = 17Under the conditions detailed 
in the reference, the survival probability in the cone 17 satisfies: 


F{t) ~ 
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where v is the smallest positive number such that there exists a function hi,{0) 
on the unit sphere vanishing at the border of J7o and satisfying: 

Ash^{9) =-Xh^{6), X = i'{i'+ d-2), 

where As is the spherical Laplace operator. 

Thus, we are again in the conditions of Theorem with a = v 12. The 
exponent v is, however, difficult to compute in general. 


4.6 More complex random walk problems 


In the previous section we considered random walks on a lattice that shall avoid 
some “wall” prescribed deterministically. Here we consider a more complex 
problem in which the growing structure produces the walls dynamically. Say 
that two paths on a graph intersect if they share some vertex. We have two 
classes of problems: (PI) a walk lu of length n, starting at a neighbour of 
the origin, such that there exists some infinite walk connecting the origin to 
infinity and not intersecting uj. (P2) for k > 2, fc-tuples of walks (wi,... , 0 ;^), 
of length n, starting from nearby vertices (e.g., aligned along a line), that shall 
not intersect each other. 

For undirected random walks, the simplest lattice is Z^, i.e., with the 
2D possible steps {s“} = {(0,0,..., ±1,..., 0)} uniformly chosen. The ana¬ 
logue for directed random walks is N x Z^“^, i.e., with the 2{D — 1) possi¬ 
ble steps {s“} = {(1, 0,..., ±1,..., 0)} uniformly chosen. More generally, we 
may consider unbiased isotropic (undirected) random walks, i.e., walks that 
can perform steps s“ G Z^ with weight Wa, such that J2a'^asf = 0 for all 
i = 1, ..., D and Wasfs'j = CSij. In the directed variant we have s“ = 1 
for all steps a, and all other compatible constraints are left as are. 

The associated exponents, when non-trivial {i.e., for D sufficiently small), 
are in general hard to evaluate. For directed walks in D = 2, (PI) is trivial, and 
(P2) is called vicious walkers. The well-known relation with classical ensembles 
of random matrices gives a = k{k — l)/4 in that case. This means that we 
have no problems in the interesting range 0 < a < 1 , except for k = 2, which, 
on N X Z, reduces to prefixes of Dyck paths through a simple bijectior0 

For undirected walks in D = 2, conformal invariance, and even better 
the connection with the exactly solvable analysis on random planar graphs 
via KPZ relation (Knizhnik et al. (1988)), have led to the determination of 
a variety of critical exponents, which have been proven subsequently by SLE 
techniques (see Duplantier ( 1998|) for the original conjectures, and|Lawler et al. 


( 2001a|b||2002 ) for the proofs). As shown in Lawler et al. ( 2001b[ ), we have a = 
^(4fc^ — 1), in a unified formula for (PI) (using fc = 1) and for (P2)j^Thus we 


® It is worth noting that, still on N X Z, and at generic k, in the variant in which the end- 
points are prescribed, exact enumeration formulas allow for an efficient algorithm, involving 
no anticipated rejection (see ( |Bonichon| |2002l Chapt. 4)). We thank the anonymous referee 
for pointing us towards this reference. 

Incidentally, note that also in the directed case the formula for a{k) matches with the 
trivial value o; = 0 for problem (PI). 
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Fig. 4 A random sampling of two walks, each of length n = 2000, on Z^, starting at 
neighbouring points and avoiding each other. 


have two new problems in the interesting range: problem (PI), following the 
law 2?(|), and problem (P2) with fc = 2, following the law r’(|). An example 
of the latter is in Figure]^ 
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